
    Computational deconvolution to estimate cell type-specific gene expression from bulk data

    Computational deconvolution is a time- and cost-efficient approach to obtain cell type-specific information from bulk gene expression of heterogeneous tissues like blood. Deconvolution can aim to estimate cell type proportions or abundances in samples, to estimate how strongly each present cell type expresses different genes, or to do both simultaneously. Of the two separate goals, the estimation of cell type proportions/abundances is widely studied, but less attention has been paid to defining the cell type-specific expression profiles. Here, we address this gap by introducing a novel method, Rodeo, and empirically evaluating it and the other available tools from multiple perspectives utilizing diverse datasets.

    Estimating cell type-specific differential expression using deconvolution

    When differentially expressed genes are detected from samples containing different types of cells, only a very coarse overview without any cell type-specific information is obtained. Although several computational methods have been published to estimate cell type-specific differentially expressed genes from bulk samples, their performance has not been evaluated outside the original publications. Here, we compare accuracies of nine of these methods, test their sensitivity to various factors often present in real studies and provide practical guidelines for end users about when reliable results can be expected and when not. Our results show that TOAST, CARseq, CellDMC and TCA are accurate methods with their own strengths and weaknesses. Notably, methods designed to detect cell type-specific differential methylation were comparable to those designed for gene expression, and both types outperformed methods originally designed for other tasks. The most important factors affecting the accuracy of the estimated cell type-specific differentially expressed genes are (i) abundance of the cell type (rare cell types are harder to analyze) and (ii) individual heterogeneity in the cell type-specific expression profiles (stable cell types are easier to analyze).

    RepViz: A replicate-driven R tool for visualizing genomic regions

    Objective: Visualization of sequencing data is an integral part of genomic data analysis. Although there are several tools to visualize sequencing data on genomic regions, they do not offer user-friendly ways to simultaneously view different groups of replicates. To address this need, we developed a tool that allows efficient viewing of both intra- and intergroup variation of sequencing counts on a genomic region, as well as their comparison to the output of user-selected analysis methods, such as peak calling. Results: We present an R package, RepViz, for replicate-driven visualization of genomic regions. With ChIP-seq and ATAC-seq data we demonstrate its potential to aid visual inspection involved in the evaluation of normalization, outlier behavior, detected features from differential peak calling analysis, and combined analysis of multiple data types. RepViz is readily available on Bioconductor (https://www.bioconductor.org/packages/devel/bioc/html/RepViz.html) and on GitHub (https://github.com/elolab/RepViz).

    Soccer Team Vectors

    In this work we present STEVE (Soccer TEam VEctors), a principled approach for learning real-valued vectors for soccer teams where similar teams are close to each other in the resulting vector space. STEVE relies only on freely available information about the matches teams played in the past. These vectors can serve as input to various machine learning tasks. Evaluating on the task of team market value estimation, STEVE outperforms all its competitors. Moreover, we use STEVE for similarity search and to rank soccer teams. Comment: 11 pages, 1 figure; This paper was presented at the 6th Workshop on Machine Learning and Data Mining for Sports Analytics at ECML/PKDD 2019, Würzburg, Germany, 201

    Comparison of methods to detect differentially expressed genes between single-cell populations

    We compared five statistical methods to detect differentially expressed genes between two distinct single-cell populations. Currently, it remains unclear whether differential expression methods developed originally for conventional bulk RNA-seq data can also be applied to single-cell RNA-seq data analysis. Our results in three diverse comparison settings showed marked differences between the different methods in terms of the number of detections as well as their sensitivity and specificity. They, however, did not reveal systematic benefits of the currently available single-cell-specific methods. Instead, our previously introduced reproducibility-optimization method showed good performance in all comparison settings without any single-cell-specific modifications.

    Weight Loss Trajectories in Healthy Weight Coaching: Cohort Study

    Background: As global obesity prevalence continues to increase, there is a need for accessible and affordable weight management interventions, such as web-based programs. Objective: This paper aims to assess the outcomes of healthy weight coaching (HWC), a web-based obesity management program integrated into standard Finnish clinical care. Methods: HWC is an ongoing, structured digital 12-month program based on acceptance and commitment therapy. It includes weekly training sessions focused on lifestyle, general health, and psychological factors. Participants received remote one-on-one support from a personal coach. In this real-life, single-arm, prospective cohort study, we examined the total weight loss, weight loss profiles, and variables associated with weight loss success and program retention in 1189 adults (963 women) with a BMI >25 kg/m² among participants of the program between October 2016 and March 2019. Absolute (kg) and relative (%) weight loss from the baseline were the primary outcomes. We also examined the weight loss profiles, clustered based on the dynamic time-warping distance, and the possible variables associated with greater weight loss success and program retention. We compared different groups using the Mann-Whitney test or Kruskal-Wallis test for continuous variables and the chi-squared test for categorical variables. We analyzed changes in medication using the McNemar test. Results: Among those having reached the 12-month time point (n=173), the mean weight loss was 4.6% (SE 0.5%), with 43% (n=75) achieving clinically relevant weight loss (≥5%). Baseline BMI ≥40 kg/m² was associated with a greater weight loss than a lower BMI (mean 6.6%, SE 0.9%, vs mean 3.2%, SE 0.6%; P=.02). In addition, more frequent weight reporting was associated with greater weight loss. No significant differences in weight loss were observed according to sex, age, baseline disease, or medication use. The total dropout rate was 29.1%. Dropouts were slightly younger than continuers (47.2, SE 0.6 years vs 49.2, SE 0.4 years; P=.01) and reported their weight less frequently (3.0, SE 0.1 entries per month vs 3.3, SE 0.1 entries per month; P […]). Conclusions: A comprehensive web-based program such as HWC is a potential addition to the repertoire of obesity management in a clinical setting. Heavier patients lost more weight, but weight loss success was otherwise independent of baseline characteristics.

    Environmental Characteristics and Anthropogenic Impact Jointly Modify Aquatic Macrophyte Species Diversity

    Species richness and spatial variation in community composition (i.e., beta diversity) are key measures of biodiversity. They are largely determined by natural factors, but also increasingly affected by anthropogenic factors. Thus, there is a need for a clear understanding of the human impact on species richness and beta diversity, the underlying mechanisms, and whether human-induced changes can override natural patterns. Here, we dissect the patterns of species richness, community composition and beta diversity in relation to different environmental factors as well as human impact in one framework: aquatic macrophytes in 66 boreal lakes in Eastern Finland. The lakes had been classified as having high, good or moderate status (according to the ecological classification of surface waters in Finland), reflecting multifaceted human impact. We used generalized least squares models to study the association between different environmental variables (Secchi depth, irregularity of the shoreline, total phosphorus, pH, alkalinity, conductivity) and species richness. We tested the null hypothesis that the observed community composition can be explained by random distribution of species. We used multivariate distance matrix regression to test the effect of each environmental variable on community composition, and a distance-based test for homogeneity of multivariate dispersion to test whether lakes classified as high, good or moderate status have different beta diversity. We showed that environmental drivers of species richness and community composition were largely similar, although dependent on the particular life-form group studied. The most important ones were characteristics of water quality (pH, alkalinity, conductivity) and irregularity of the shoreline. Differences in community composition were related to environmental variables independently of species richness. Species richness was higher in lakes with higher levels of human impact. Lakes with different levels of human impact had different community composition. Between-lake beta diversity did not differ among the high, good and moderate status groups. However, the variation in environmental variables shaping community composition was larger in lakes with moderate status compared to other lakes. Hence, beta diversity in lakes with moderate status was smaller than what could be expected on the basis of these environmental characteristics. This could be interpreted as homogenization

    PASI: A novel pathway method to identify delicate group effects

    Pathway analysis is a common approach in diverse biomedical studies, yet the currently available pathway tools do not typically support the increasingly popular personalized analyses. Another weakness of the currently available pathway methods is their inability to handle challenging data with only modest group-based effects compared to natural individual variation. In an effort to address these issues, this study presents a novel pathway method, PASI (Pathway Analysis for Sample-level Information), and demonstrates its performance on complex diseases with different levels of group-based differences in gene expression. PASI is freely available as an R package.

    ROTS: An R package for reproducibility-optimized statistical testing

    Differential expression analysis is one of the most common types of analyses performed on various biological data (e.g. RNA-seq or mass spectrometry proteomics). It is the process that detects features, such as genes or proteins, showing statistically significant differences between the sample groups under comparison. A major challenge in the analysis is the choice of an appropriate test statistic, as different statistics have been shown to perform well in different datasets. To this end, the reproducibility-optimized test statistic (ROTS) adjusts a modified t-statistic according to the inherent properties of the data and provides a ranking of the features based on their statistical evidence for differential expression between two groups. ROTS has already been successfully applied in a range of different studies from transcriptomics to proteomics, showing competitive performance against other state-of-the-art methods. To promote its widespread use, we introduce here a Bioconductor R package for performing ROTS analysis conveniently on different types of omics data. To illustrate the benefits of ROTS in various applications, we present three case studies, involving proteomics and RNA-seq data from public repositories, including both bulk and single-cell data. The package is freely available from Bioconductor (https://www.bioconductor.org/packages/ROTS).